Abstract

For a complete graph of size $n$, assign each edge an i.i.d. exponential variable with mean $n$. For $\lambda > 0$, consider the length of the longest path whose average weight is at most $\lambda$. It was shown by Aldous (1998) that the length is of order $\log n$ for $\lambda < 1/e$ and of order $n$ for $\lambda > 1/e$. In this paper, we study the near-supercritical regime where $\lambda = e^{-1} + \eta$ with $\eta > 0$ a small fixed number. We show that there exist two absolute constants $C_1, C_2 > 0$ such that with high probability the length is in between $n e^{-C_1/\sqrt{\eta}}$ and $n e^{-C_2/\sqrt{\eta}}$. Our result corrects a non-rigorous prediction of Aldous (2005).

1 Introduction

Let $\mathcal{W}_n$ be a complete undirected graph on $n$ vertices where each edge is assigned an independent exponential weight with mean $n$; this is referred to as the stochastic mean-field ($\mathrm{SMF}_n$) model. For a (self-avoiding) path $\pi = (v_0, v_1, \ldots, v_m)$, define its length $len(\pi)$ and average weight $A(\pi)$ by

$$len(\pi) = m, \quad \text{and} \quad A(\pi) = \frac{1}{m}\sum_{i=1}^{m} W_{(v_{i-1}, v_i)},$$

where $W_{(u,v)}$ is the weight of the edge $(u,v)$. For $\lambda > 0$, let $L(n,\lambda)$ be the length of the longest path with average weight at most $\lambda$, i.e.,

$$L(n,\lambda) = \max\{len(\pi) : A(\pi) \le \lambda,\ \pi \text{ is a path in the } \mathrm{SMF}_n \text{ model}\}.$$
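To make the definition of $L(n,\lambda)$ concrete, the following is a minimal brute-force sketch that evaluates it on a tiny instance by exhaustive depth-first search over self-avoiding paths. It is an illustration only (the running time is exponential in $n$), and the function names `sample_smf` and `longest_light_path` are hypothetical, not taken from the paper.

```python
import random

def sample_smf(n, rng):
    """Symmetric weight matrix of the SMF_n model: i.i.d. Exp(mean n) on edges."""
    w = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            w[u][v] = w[v][u] = rng.expovariate(1.0 / n)  # rate 1/n, i.e. mean n
    return w

def longest_light_path(w, lam):
    """Largest m such that some self-avoiding path with m edges has
    average weight at most lam (returns 0 if no single edge qualifies)."""
    n = len(w)
    best = 0

    def extend(last, visited, length, total):
        nonlocal best
        if length > 0 and total <= lam * length:
            best = max(best, length)
        # No pruning: a heavy prefix can still be compensated by light edges later.
        for v in range(n):
            if v not in visited:
                visited.add(v)
                extend(v, visited, length + 1, total + w[last][v])
                visited.remove(v)

    for start in range(n):
        extend(start, {start}, 0, 0.0)
    return best
```

For example, `longest_light_path(sample_smf(7, random.Random(0)), 0.5)` evaluates $L(7, 0.5)$ for one sampled instance; such small values of $n$ say nothing about the asymptotic regime studied here.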

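In a non-rigorous paper of Aldous [2], it was predicted that $L(n,\lambda) \asymp n(\lambda - e^{-1})^{\beta}$ with $\beta = 3$ as $\lambda \downarrow e^{-1}$. Our main result is the following theorem, which corrects Aldous' prediction.

Theorem 1.1. Let $\lambda = 1/e + \eta$ where $\eta > 0$. Then there exist absolute constants $C_1, C_2, \eta^* > 0$ such that for all $\eta \le \eta^*$,

$$\lim_{n\to\infty} \mathbb{P}\big(n e^{-C_1/\sqrt{\eta}} \le L(n,\lambda) \le n e^{-C_2/\sqrt{\eta}}\big) = 1. \tag{1.1}$$
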
*Partially supported by NSF grant DMS-1313596.

The study of the object $L(n,\lambda)$ was initiated by Aldous [1], where a phase transition was discovered at the threshold $e^{-1}$. It was shown that with high probability $L(n,\lambda)$ is of order $\log n$ for $\lambda < e^{-1}$ and of order $n$ when $\lambda > e^{-1}$. The critical behavior was established in [3], where it was proved that with high probability $L(n,\lambda)$ is of order $(\log n)^3$ when $\lambda$ is around $e^{-1}$ within a window of order $(\log n)^{-2}$. Our Theorem 1.1 describes the behavior in the near-supercritical regime, and in particular states that $L(n,\lambda)/n$ is a stretched exponential in $\eta$ with $\eta = \lambda - e^{-1} \downarrow 0$. Another interesting result proved in [3] states that $L(n,\lambda)$ is at least a fractional power of $n$ in a somewhat similar regime, namely $\lambda \ge 1/e + \beta(\log n)^{-2}$, where $\beta > 0$ is an absolute constant. Notice that substituting $\eta = C(\log n)^{-2}$ in (1.1), we indeed get a fractional power of $n$, since then $n e^{-C_1/\sqrt{\eta}} = n^{1 - C_1/\sqrt{C}}$. In fact our method should work, subject to some technical modifications, all the way down to $\eta = C(\log n)^{-2}$ for a large absolute constant $C$. However, we do not attempt any rigorous proof of this in the current paper.

A highly related question is the length of the cycle of minimal mean weight, which was studied by Mathieu and Wilson [9]. An interesting phase transition was found in [9] with critical threshold $e^{-1}$ on the mean weight. Further results on this problem have been proved in [5]. It might be relevant to mention here that the method used in [5] could be potentially useful for nailing down the second phase transition detected in [3], namely the transition from $\eta = \alpha(\log n)^{-2}$ to $\eta = \beta(\log n)^{-2}$ where $\alpha, \beta$ are positive constants.

Another related question is the classical travelling salesman problem (TSP), where one minimizes the weight of the path subject to passing through every single vertex in the graph. For the TSP in the mean-field set up, Wästlund [12] established the sharp asymptotics for more general distributions on the edge weight, confirming the Krauth-Mézard-Parisi conjecture [10, 11, 8]. Indeed, it is an interesting challenge to give a sharp estimate on $L(n,\lambda)$ for $e^{-1} < \lambda < \lambda^*$ (here $\lambda^*$ is the asymptotic value for TSP), interpolating the critical behavior and the extremal case of TSP.
A question of the same flavor on steiner tree is given in m- 

One can also look at the maximum size of a tree with average weight below a certain threshold, where a phase transition was proved in [1]. The extremal case of the question on the tree with minimal average weight is the well-known minimal spanning tree problem, where a $\zeta(3)$ limit is established by Frieze [7].

Main ideas of our proofs. A straightforward first moment computation as done in [1] implies that $\lim_{n\to\infty} \mathbb{P}(L(n,\lambda) = O(\log n)) = 1$ when $\lambda < 1/e$ (see also [3, Theorem 1.3]). For $\lambda > 1/e$, a sprinkling method was employed in [1] to show that with high probability $L(n,\lambda) = \Theta(n)$. The author first proved that with high probability there exist a large number of paths with average weight slightly above $1/e$ and then used a certain greedy algorithm to connect these paths into a single long path with average weight slightly above $1/e$. However, the method in [1] was not able to describe the behavior at criticality. In [3] (see also [9] for the cycle with minimal average weight), a second moment computation was carried out restricted to paths of average weight below $1/e$ and with the maximal deviation (defined in (3.2) below) at most $O(\log n)$, thereby yielding that with high probability $L(n, 1/e) = \Theta((\log n)^3)$. A crucial fact responsible for the success of the second moment computation is that the length of the target path is $\Theta((\log n)^3) \ll \sqrt{n}$. As such, a straightforward adaptation of this method would not be able to succeed in the regime considered by this paper.

TSP, where one studies paths (cycles) that visit every single vertex, is in a sense analogous to the question of finding the minimal value $\lambda$ for which $L(n,\lambda) = n$ with high probability. Wästlund [12] showed that the minimum average cost of TSP converges in probability to a positive constant by relaxing it to a certain linear optimization problem. But it seems difficult to extend his method to "incomplete" TSP, i.e., when the target object is the minimum cost cycle having at least $pn$ many edges for some $p \in (0,1)$. Since our problem is in a sense dual to incomplete TSP in the regime we are interested in, the method of [12] does not seem to be suitable for our purpose either. In the current work, our method is inspired by the (first and) second moment method from [3, 9] as well as the sprinkling method employed in [1].

In order to prove the upper bound, our main intuition is that if $L(n,\lambda)$ were greater than $n e^{-C_2/\sqrt{\eta}}$, then we would have a larger number of short and light paths (a light path refers to a path with small average weight, at most a little above $1/e$) than we would typically expect.
Formally, let ^ where ci is a small positive constant, and consider the number of paths (denoted 

by ./Vr;/ci,c 2 ) with length £ and total weight no more than X£ — C 2 \/^ for a positive constant C 2 . We 
call such a path a downcrossing. A straightforward computation gives ,,2 = 0{l)n£e~^^/^ 

for a positive constant C3 depending on ci and C2. Now we consider the number of paths (denoted 
by Ns) of length 5{X)n and average weight at most A. Such paths have two possibilities: (1) The 
path contains substantially more than EA^/ci,c 2 many downerossings, which is unlikely by Markov’s 
inequality. (2) The path does not have substantially more than EA^/ci,c 2 many downer ossings. This 
is also unlikely for the following reasons: (a) A straightforward first moment computation gives 
that EA^ = for a constant C 4 > 0; (b) The number of downcrossings along a path of 

this kind, or a random variable that is “very likely” smaller, should dominate a Binomial random 
variable Bin((in/£, C5) where C5 > 0 is an absolute constant (since in the random walk bridge, every 
subpath of size I has a positive chance to have such a downcrossing). If we choose b suitably 
large as in Theorem [H we are suffering a probability cost for the constraint on the number of 
downcrossings (probability for a binomial much smaller than its mean) and this probability cost 
is of magnitude for a constant ce > 0 depending in ci. If we choose ci small enough this 

probability cost kills the growth of in EA^. Therefore, paths of this kind do not exist either. 

The details are carried out in Section 2.

For the lower bound, our proof consists of two steps. In light of the preceding discussion, we 
cannot hope to directly apply a second moment method from mi to show the existence of a light 
path that is of length linear in n. As such, in the first step of our proof we prove that with high 
probability there exists a linear (in n) number of disjoint paths, each of which has weight slightly 
below A and is of length for an absolute constant cy > 0. This is achieved by two second 

moment computations, which are expected to succeed as the length of the path under consideration is $\ll \sqrt{n}$ (indeed remains bounded as $n \to \infty$). In the second step, we propose an algorithm which, with probability going to 1, strings together a suitable collection of these short light paths to form a light path of length $n e^{-c_8/\sqrt{\eta}}$ for an absolute constant $c_8 > 0$. Our algorithm is similar to the greedy algorithm (also known as an exploration process) employed in [1]. But in order to ensure that the additional weight introduced by these connecting bridges only increases the average weight of the final path by at most a multiple of $\eta$, we have to use a more delicate algorithm. The details are carried out in Section 3.

Notation convention. For a graph $G$, we denote by $V(G)$ and $E(G)$ the set of vertices and edges of $G$ respectively. A path in a graph $G$ is a (finite) ordered tuple of vertices $(v_0, v_1, \ldots, v_m)$, all distinct. For a path $\pi = (v_0, v_1, \ldots, v_m)$, we also use $\pi$ to denote the graph whose vertices are $v_0, v_1, \ldots, v_m$ and edges are $(v_0, v_1), \ldots, (v_{m-1}, v_m)$. This would be clear from the context. The weight of an edge $e$ in $\mathcal{W}_n$ is denoted by $W_e$ and we define the total weight $W(\pi)$ of a path $\pi$ as $\sum_{e \in E(\pi)} W_e$. The collection of all paths in $\mathcal{W}_n$ of length $\ell \in [n]$ is denoted as $\Pi_\ell$. We let $\lambda = 1/e + \eta$ where $\eta$ is a fixed positive number. A path is called $\lambda$-light if its average weight is at most $\lambda$, and a path is called $(\lambda, C)$-light if its total weight is at most $\lambda\ell - C\sqrt{\ell}$ where $\ell$ is the length of the path. For nonnegative real or integer valued variables $x_0, x_1, \ldots, x_n$, let $S$ be a statement involving $x_0, x_1, \ldots, x_n$. We say that $S$ holds "for large $x_0$ (given $x_1, \ldots, x_n$)" or "when $x_0$ is large (given $x_1, \ldots, x_n$)" if it holds for any fixed values of $x_1, \ldots, x_n$ in their respective domains and $x_0 \ge a_0$ where $a_0$ is some positive number depending on the fixed values of $x_1, \ldots, x_n$. In case $a_0$ is an absolute constant, the phrase "(given $x_1, \ldots, x_n$)" will be dropped. We use "for small $x_0$" or "when $x_0$ is small" with or without the qualifying phrase "(given $x_1, x_2, \ldots, x_n$)" in similar situations if the statement $S$ holds instead for $0 < x_0 \le a_0$. Throughout this paper the order notations $O(\cdot)$, $\Theta(\cdot)$, $o(\cdot)$ etc. are assumed to be with respect to $n \to \infty$ while keeping all the other involved parameters (such as $\ell$, $\eta$ etc.) fixed. We will use $C_1, C_2, \ldots$ to denote constants, and each $C_i$ will denote the same number throughout the rest of the paper.

Acknowledgements. We are grateful to David Aldous for very useful discussions, and we thank an anonymous referee for a careful review of an earlier manuscript and for suggesting a simpler proof of Lemma 3.8.


2 Proof of the upper bound


Let $\eta'$ be a multiple of $\eta$ by a constant bigger than 1 whose precise value is to be selected. Set $\ell = \lfloor 1/\eta' \rfloor$ and let $N_{\eta'}$ be the number of "$(\lambda, 1)$-light" paths of length $\ell$. We assume $\eta' < 1$ so that $\ell \ge 1$. As outlined in the introduction, we shall first control $N_{\eta'}$.

It is clear that the total weight of a path of length $k$ follows a Gamma distribution $\Gamma(k, 1/n)$, where the density $f_{\theta,k}(z)$ of $\mathrm{Gamma}(k, \theta)$ is given by

$$f_{\theta,k}(z) = \theta^k z^{k-1} e^{-\theta z}/(k-1)! \quad \text{for all } z \ge 0,\ \theta > 0 \text{ and } k \in \mathbb{N}. \tag{2.1}$$


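By (2.1) and Stirling's formula, we carry out a straightforward computation and get that

$$\begin{aligned}
\mathbb{E}N_{\eta'} &= (1+o(1)) \times n^{\ell+1} \times \mathbb{P}\big(\mathrm{Gamma}(\ell, 1/n) \le \lambda\ell - \sqrt{\ell}\big) \\
&= (1+o(1)) \times n^{\ell+1} \times \frac{1}{\ell!}\Big(\frac{\lambda\ell - \sqrt{\ell}}{n}\Big)^{\ell} \\
&= (1+o(1))\, C_0(\eta)\, a\, e^{e\eta/\eta'} \sqrt{\eta'}\, e^{-e/\sqrt{\eta'}}\, n,
\end{aligned} \tag{2.2}$$

where $C_0(\eta) \to 1$ as $\eta \to 0$, and $a$ is a positive constant. Furthermore the factors $1 + o(1)$ are strictly less than 1.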
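As a numerical sanity check on the first step of this computation, the sketch below evaluates $\mathbb{E}N_{\eta'}$ exactly (number of paths times a Gamma probability) and compares it with the leading-order term $n(\lambda\ell - \sqrt{\ell})^{\ell}/\ell!$. It rests on assumptions that are not from the paper: paths are counted as ordered vertex tuples, so that there are $n(n-1)\cdots(n-\ell)$ of them, and the function names are hypothetical.

```python
import math

def log_gamma_cdf_small(k, t, terms=60):
    """log P(Gamma(k, rate 1) <= t) for integer k >= 1, via the Poisson series
    e^{-t} * sum_{i >= k} t^i / i!  (stable for small t; needs t well below k)."""
    log_first = k * math.log(t) - math.lgamma(k + 1)  # log of t^k / k!
    tail, term = 1.0, 1.0
    for i in range(1, terms):
        term *= t / (k + i)
        tail += term
    return -t + log_first + math.log(tail)

def log_expected_light_paths(n, ell, lam):
    """log E[# ordered paths with ell edges and total weight <= lam*ell - sqrt(ell)]."""
    x = lam * ell - math.sqrt(ell)
    if x <= 0:
        raise ValueError("threshold lam*ell - sqrt(ell) must be positive")
    # Number of ordered tuples of ell + 1 distinct vertices: n(n-1)...(n-ell).
    log_count = sum(math.log(n - i) for i in range(ell + 1))
    # A sum of ell Exp(mean n) weights is n times a Gamma(ell, rate 1) variable.
    return log_count + log_gamma_cdf_small(ell, x / n)

def log_leading_order(n, ell, lam):
    """log of n * (lam*ell - sqrt(ell))^ell / ell!, the leading-order term."""
    x = lam * ell - math.sqrt(ell)
    return math.log(n) + ell * math.log(x) - math.lgamma(ell + 1)
```

With $n = 10^6$, $\ell = 100$ and $\lambda = 1/e + 0.01$ the two logarithms differ only slightly, and the exact value is the smaller one, in line with the remark that the $1+o(1)$ factors are below 1.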

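We also need a bound on the second moment of $N_{\eta'}$ to control its concentration around $\mathbb{E}N_{\eta'}$. For $\gamma \in \Pi_\ell$, define $F_\gamma$ to be the event that $\gamma$ is $(\lambda, 1)$-light. Then clearly we have $N_{\eta'} = \sum_{\gamma \in \Pi_\ell} \mathbf{1}_{F_\gamma}$. In order to compute $\mathbb{E}(N_{\eta'})^2$, we need to estimate $\mathbb{P}(F_\gamma \cap F_{\gamma'})$ for $\gamma, \gamma' \in \Pi_\ell$. In the case $E(\gamma) \cap E(\gamma') = \emptyset$, we have $F_\gamma$ and $F_{\gamma'}$ independent of each other and thus $\mathbb{P}(F_{\gamma'} \mid F_\gamma) = \mathbb{P}(F_{\gamma'})$. In the case $|E(\gamma) \cap E(\gamma')| = j > 0$, we have

$$\mathbb{P}(F_{\gamma'} \mid F_\gamma) \le \mathbb{P}\big(\mathrm{Gamma}(\ell - j, 1/n) \le \lambda\ell\big) \le \frac{(\lambda\ell)^{\ell-j}}{(\ell-j)!\, n^{\ell-j}}. \tag{2.3}$$

Further notice that if $|E(\gamma) \cap E(\gamma')| = j$, then $|V(\gamma) \cap V(\gamma')|$ is at least $j + 1$ as $\gamma \cap \gamma'$ is acyclic. So given any $\gamma \in \Pi_\ell$, the number of paths $\gamma'$ such that $|E(\gamma) \cap E(\gamma')| = j$ is at most $O(n^{\ell-j})$. Altogether, we obtain that

$$\begin{aligned}
\mathbb{E}(N_{\eta'})^2 &= \sum_{\gamma,\gamma' \in \Pi_\ell} \mathbb{P}(F_\gamma \cap F_{\gamma'}) = \sum_{\gamma \in \Pi_\ell} \mathbb{P}(F_\gamma) \sum_{\gamma' \in \Pi_\ell} \mathbb{P}(F_{\gamma'} \mid F_\gamma) \\
&\le \sum_{\gamma \in \Pi_\ell} \mathbb{P}(F_\gamma) \Big( \sum_{\gamma' : E(\gamma' \cap \gamma) = \emptyset} \mathbb{P}(F_{\gamma'}) + \sum_{1 \le j \le \ell}\ \sum_{\gamma' : |E(\gamma' \cap \gamma)| = j} \frac{(\lambda\ell)^{\ell-j}}{(\ell-j)!\, n^{\ell-j}} \Big) \\
&\le \sum_{\gamma \in \Pi_\ell} \mathbb{P}(F_\gamma) \Big( \sum_{\gamma' : E(\gamma' \cap \gamma) = \emptyset} \mathbb{P}(F_{\gamma'}) + \sum_{1 \le j \le \ell} O(n^{\ell-j}) \frac{(\lambda\ell)^{\ell-j}}{(\ell-j)!\, n^{\ell-j}} \Big) \\
&\le \sum_{\gamma \in \Pi_\ell} \mathbb{P}(F_\gamma) \big(\mathbb{E}N_{\eta'} + O(1)\big) = \mathbb{E}N_{\eta'} \big(\mathbb{E}N_{\eta'} + O(1)\big).
\end{aligned} \tag{2.4}$$

Since $\mathbb{E}N_{\eta'} = \omega(1)$ as implied by (2.2), (2.4) yields that

$$\mathbb{E}N_{\eta'}^2 = (\mathbb{E}N_{\eta'})^2 (1 + o(1)). \tag{2.5}$$

As a consequence of Markov's inequality (applied to $|N_{\eta'} - \mathbb{E}N_{\eta'}|^2$), we get that

$$\mathbb{P}(N_{\eta'} \ge 2\,\mathbb{E}N_{\eta'}) = o(1). \tag{2.6}$$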

Next, we set out to show that any long $\lambda$-light path should have a large number of subpaths which are $(\lambda, 1)$-light. Let $\pi$ be a path of length $\delta n$ for some $\delta > 0$. Denote its successive edge weights by $X_1, X_2, \ldots, X_{\delta n}$ and let $S_k = \sum_{i=1}^{k} X_i$ for $1 \le k \le \delta n$. Probabilities of events involving edge weights of $\pi$, unless specifically mentioned, will be assumed to be conditioned on "$\{A(\pi) \le \lambda\}$" throughout the remainder of this section. Now divide $\pi$ into edge-disjoint subpaths of length $\ell$ (with the last subpath of length possibly less than $\ell$ in the case $\ell$ does not divide $\delta n$) and denote the $k$-th subpath by $b_k$ for $1 \le k \le \delta n/\ell$. Call any such subpath a downcrossing if it is $(\lambda, 1)$-light. Let
^kri' n event that 6^ is a downcrossing. The following well-known result about exponential 

random variables (see, e.g., [3l Theorem 6.6]) will be very useful. 

Lemma 2.1. Let $Y_1, Y_2, \ldots, Y_N$ be i.i.d. exponential random variables with mean $1/\theta$, and let $S_k = \sum_{i=1}^{k} Y_i$ for $1 \le k \le N$. Then the random vector $(Y_1/S_N, \ldots, Y_N/S_N)$ follows the $\mathrm{Dirichlet}(\mathbf{1}_N)$ distribution, $S_N$ follows the $\mathrm{Gamma}(N, \theta)$ distribution, and they are independent of each other. Here $\mathbf{1}_N$ is the $N$-dimensional vector whose entries are all 1.
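One consequence of this lemma that is used repeatedly is that the normalized vector does not depend on the mean of the underlying exponentials. The sketch below illustrates just this scale invariance (not the full independence statement): drawing from the same seed with two different means gives the same proportions up to rounding. The function name `dirichlet_proportions` is hypothetical.

```python
import random

def dirichlet_proportions(mean, n_vars, seed):
    """Draw n_vars i.i.d. Exp(mean) variables and return (proportions, total),
    where proportions[i] = Y_i / S_N and total = S_N."""
    rng = random.Random(seed)
    ys = [rng.expovariate(1.0 / mean) for _ in range(n_vars)]  # rate = 1 / mean
    total = sum(ys)
    return [y / total for y in ys], total

# Same seed, different means: the proportions agree, only the total rescales.
props_a, total_a = dirichlet_proportions(1.0, 10, seed=42)
props_b, total_b = dirichlet_proportions(5.0, 10, seed=42)
```

The agreement is exact in distribution by the lemma; with a shared seed it also holds sample by sample because `random.expovariate` only rescales the same underlying uniform draws.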


We will also require the following simple lemma which we prove for the sake of completeness.


Lemma 2.2. Let $Z_1, Z_2, \ldots, Z_N$ be i.i.d. exponential random variables with mean 1 and let $S_N = \sum_{i=1}^{N} Z_i$. Then

$$\mathbb{P}(S_N \ge N + a) \le e^{-a^2/4N} \quad \text{for all } 0 \le a \le (2 - \sqrt{2})N, \tag{2.7}$$

$$\mathbb{P}(S_N \le N - a) \le e^{-a^2/2N} \quad \text{for all } a \ge 0. \tag{2.8}$$


Proof. By Markov's inequality, we get that for any $a > 0$ and $0 < \theta < 1$,

$$\mathbb{P}(S_N \ge N + a) = \mathbb{P}\big(e^{\theta S_N} \ge e^{\theta(N+a)}\big) \le e^{-\theta(N+a)} (1-\theta)^{-N}.$$

When $\theta \le 1 - 1/\sqrt{2}$, the right hand side is bounded above by $e^{N\theta^2 - \theta a}$. So setting $\theta = a/2N$ yields (2.7) as long as $0 \le a/2N \le 1 - 1/\sqrt{2}$. One can prove (2.8) in the same manner. □
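The two tail bounds can be checked against exact Gamma tails for small integer $N$. The sketch below assumes the bounds in the form $e^{-a^2/4N}$ for the upper tail on $0 \le a \le (2-\sqrt{2})N$ and $e^{-a^2/2N}$ for the lower tail, which is what the Chernoff argument in the proof gives; the function names are hypothetical and the grid of test points is arbitrary.

```python
import math

def gamma_upper_tail(n, x):
    """P(S_n >= x) for S_n ~ Gamma(n, rate 1), n a positive integer:
    e^{-x} * sum_{i < n} x^i / i!."""
    term, total = 1.0, 1.0
    for i in range(1, n):
        term *= x / i
        total += term
    return math.exp(-x) * total

def gamma_lower_tail(n, x, terms=400):
    """P(S_n <= x) = e^{-x} * sum_{i >= n} x^i / i!, summed directly
    to avoid cancellation when the probability is tiny."""
    if x <= 0:
        return 0.0
    term = math.exp(n * math.log(x) - math.lgamma(n + 1) - x)
    total = term
    for i in range(n + 1, n + terms):
        term *= x / i
        total += term
    return total

def chernoff_bounds_hold(n, grid=100, slack=1e-12):
    """Check both tail bounds on a grid of deviations a."""
    for step in range(grid + 1):
        a_up = (2 - math.sqrt(2)) * n * step / grid
        if gamma_upper_tail(n, n + a_up) > math.exp(-a_up ** 2 / (4 * n)) + slack:
            return False
        a_low = n * step / grid
        if gamma_lower_tail(n, n - a_low) > math.exp(-a_low ** 2 / (2 * n)) + slack:
            return False
    return True
```

A grid check of this kind is only a consistency test of the stated constants; it does not replace the proof.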


As hinted in the introduction, let us begin with the intent to prove that the number of downcrossings along the first half of $\pi$ (or any fraction of it) dominates a Binomial random variable $\mathrm{Bin}(\delta n/2\ell, p)$ for some positive, absolute constant $p$. So essentially we need to prove that a subpath $b_k$ can be a downcrossing with probability $p$ regardless of the first $(k-1)\ell$ edges of $\pi$ that precede it. Now the conditional distribution of $X_{(k-1)\ell+1}, X_{(k-1)\ell+2}, \ldots, X_{\delta n}$ given $X_1, X_2, \ldots, X_{(k-1)\ell}$ and $A(\pi) \le \lambda$ is essentially the distribution of $X_{(k-1)\ell+1}, X_{(k-1)\ell+2}, \ldots, X_{\delta n}$ conditioned on $\sum_{i=(k-1)\ell+1}^{\delta n} X_i \le \lambda\delta n - S_{(k-1)\ell}$. On the other hand we get from Lemma 2.1 that the conditional mean and variance of $W(b_k)$ given $S_{\delta n} - S_{(k-1)\ell} = \mu(\delta n - (k-1)\ell)$ are $\mu\ell$ and $\mu^2\ell(1 + o(1))$ respectively for all $\mu > 0$ and $k \le \delta n/2\ell$. Hence it is plausible to expect that the probability of the event $\{W(b_k) \le \lambda_k(\ell - C\sqrt{\ell})\}$ conditional on any set of values for $X_1, X_2, \ldots, X_{(k-1)\ell}$ is bounded away from 0 for large $\ell$ and $n$, where $\lambda_k = (S_{\delta n} - S_{(k-1)\ell})/(\delta n - (k-1)\ell)$ and $C$ is some positive number. Let us denote the event $\{W(b_k) \le \lambda_k(\ell - C\sqrt{\ell})\}$ by $A^{C}_{k,\eta',n}$. Thus it seems more immediate to prove the stochastic domination for the number of occurrences of the events $A^{C}_{k,\eta',n}$ which, for the time being, can be treated as a "proxy" for the number of downcrossings. The formal statement is given in the next lemma where we use 6 as the value of $C$ since this allows us to avoid unnecessary named variables and also suits our specific needs for the computations carried out at the end of this section.

Lemma 2.3. Let $\tilde N_{\eta',n}$ be the number of occurrences of events $A_{k,\eta',n} = A^{6}_{k,\eta',n}$ for $1 \le k \le \delta n/2\ell$. Then for any $0 < \eta' < \eta_0$ where $\eta_0$ is a positive, absolute constant and any $0 < \delta_0 < 1$ there exists a positive integer $n_d = n_d(\delta_0, \eta')$ and an absolute constant $c > 0$ such that for all $\delta \ge \delta_0$ and $n \ge n_d$ the conditional distribution of $\tilde N_{\eta',n}$ given $\{A(\pi) \le \lambda\}$ stochastically dominates the binomial distribution $\mathrm{Bin}(\delta n/2\ell, c)$.

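Proof. Notice that it suffices to prove that there exist positive absolute constants $\ell_0, c$ such that uniformly for $\mu > 0$, $\ell \ge \ell_0$ and large $L$ (given $\ell$)

$$\mathbb{P}\big(S_\ell \le \mu(\ell - 6\sqrt{\ell}) \,\big|\, S_L = \mu L\big) \ge c.$$

To this end, we see that for $L > \ell$

$$\mathbb{P}\big(S_\ell \le \mu(\ell - 6\sqrt{\ell}) \,\big|\, S_L = \mu L\big) = \mathbb{P}\Big(\frac{S_\ell}{S_L} \le \frac{\ell - 6\sqrt{\ell}}{L} \,\Big|\, S_L = \mu L\Big) = \mathbb{P}\Big(\frac{S_\ell}{S_L} \le \frac{\ell - 6\sqrt{\ell}}{L}\Big), \tag{2.9}$$

where the last equality follows from Lemma 2.1. Since the distribution of $S_\ell/S_L$ does not depend on the mean of the underlying $X_j$'s, we can in fact assume that the $X_j$'s are i.i.d. exponential variables with mean 1 for the purpose of computing (2.9). By (2.8), we have

$$\mathbb{P}\big(S_L/L \le 1 - 1/(2\sqrt{\ell})\big) \le e^{-L/8\ell}.$$

So for $\ell - 6\sqrt{\ell} > 0$, we get

$$\mathbb{P}\Big(S_\ell \le \frac{S_L}{L}(\ell - 6\sqrt{\ell})\Big) \ge \mathbb{P}\big(S_\ell \le \ell - 6.5\sqrt{\ell}\big) - e^{-L/8\ell}. \tag{2.10}$$

By the central limit theorem there exist absolute numbers $\ell_0, c' > 0$ such that $\mathbb{P}(S_\ell \le \ell - 6.5\sqrt{\ell}) \ge c'$ for $\ell \ge \ell_0$. Hence from (2.10) it follows that for any $\ell \ge \ell_0$ there exists $L_0 = L_0(\ell)$ such that the right hand side of (2.9) is at least $c = 0.99c'$ for $L \ge L_0$. □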


Now what remains to show is that the number of downcrossings $N_{\eta',n}$ along $\pi$ is bigger than $\tilde N_{\eta',n}$ with high probability. Notice that the occurrence of $A_{k,\eta',n}$ without $b_k$ being a downcrossing implies that $\lambda_k$ must be "significantly" above $\lambda$. But that can only be caused by a substantial drop in $S_k$ for some $1 \le k \le \delta n/2$, an event that occurs with small probability.


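Lemma 2.4. Denote by $E_{n,\eta'}$ the event that $\lambda_k$ is more than $\lambda + \sqrt{\eta'}$ for some $1 \le k \le \delta n/2\ell$. Then for any $0 < \eta' < 1/4$ and $0 < \delta_0 < 1$ there exists a positive integer $n_g = n_g(\delta_0, \eta')$ such that,

$$\mathbb{P}(E_{n,\eta'} \mid A(\pi) \le \lambda) \le 2n e^{-\delta n \eta'/16} \quad \text{for all } \delta \ge \delta_0 \text{ and } n \ge n_g. \tag{2.11}$$


Proof. For $1 \le k \le \delta n/2\ell$, let $\ell_k = (k-1)\ell$, $n_g = \lceil 2\ell/\delta_0 \rceil$ and $E_{n,k,\eta'} = \{\lambda_k > \lambda + \sqrt{\eta'}\}$. Assume $n \ge n_g$ so that $\delta n/2\ell \ge 1$. On $E_{n,k,\eta'}$, we have

$$\frac{S_{\ell_k}}{S_{\delta n}} \le \frac{\ell_k S_{\delta n}/\delta n - \sqrt{\eta'}(\delta n - \ell_k)}{S_{\delta n}} \le \frac{\ell_k}{\delta n} - \frac{\sqrt{\eta'}(\delta n - \ell_k)}{\delta n},$$

where the last inequality holds since we are conditioning on $S_{\delta n} \le \lambda\delta n$ and $\lambda < 1$ when $\eta' < 1/4$ (recall that $\eta < \eta'$). Therefore, using Lemma 2.1 to drop the conditioning, we get

$$\mathbb{P}(E_{n,k,\eta'} \mid A(\pi) \le \lambda) \le \mathbb{P}\Big(S_{\ell_k} \le \frac{S_{\delta n}}{\delta n}\big(\ell_k - \sqrt{\eta'}(\delta n - \ell_k)\big)\Big). \tag{2.12}$$

Now we evaluate the right hand side of (2.12). Analogous to (2.9) in the proof of Lemma 2.3, we can assume without loss of generality that $X_1, X_2, \ldots$ are i.i.d. exponential variables with mean 1. It is routine to check that

$$(1 + \sqrt{\eta'}/2) \times \big(\ell_k - \sqrt{\eta'}(\delta n - \ell_k)\big) \le \ell_k - \sqrt{\eta'}\delta n/4, \quad \text{for all } 1 \le k \le \delta n/2\ell.$$

Thus, for all $1 \le k \le \delta n/2\ell$ we get

$$\begin{aligned}
\mathbb{P}\Big(S_{\ell_k} \le \frac{S_{\delta n}}{\delta n}\big(\ell_k - \sqrt{\eta'}(\delta n - \ell_k)\big)\Big) &\le \mathbb{P}\big(S_{\ell_k} \le \ell_k - \sqrt{\eta'}\delta n/4\big) + \mathbb{P}\Big(\frac{S_{\delta n}}{\delta n} \ge 1 + \sqrt{\eta'}/2\Big) \\
&\le e^{-\delta n\eta'/16} + e^{-\delta n\eta'/16},
\end{aligned}$$

where the second inequality follows from (2.8) and (2.7) respectively. Combined with (2.12), it gives that

$$\mathbb{P}(E_{n,k,\eta'} \mid A(\pi) \le \lambda) \le 2e^{-\delta n\eta'/16}, \quad \text{for all } 1 \le k \le \delta n/2\ell.$$

An application of a union bound over $k$ completes the proof of the lemma. □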


Proof of Theorem 1.1: upper bound. Assume that p' < 1/4 A r/o where po is same as given in the 

Fix a 4 = 4(V) in (0,1) and let no = no(4, V) = ^^(4, V) V n44, V), 


2.3 


statement of Lemma 

where n^, Ug are as stated in Lemmas |2.3| and |2.4| respectively. In the remaining part of this section 
we will assume that n > uq and 5 > 4; so that Lemmas |2.3| and |2.4| become applicable. Now let vr 
be a path with length 6n. From Lemma [2. 4| we get that with large probability A^ < A + ^/rf for all 
k between 1 and 6nj2i. But it takes a routine computation to show that \ {A^ < A + C 

Consequently Lemma ! 


^n,k,r]' when 7]' is small. Thus > N^, 


n',n except on 


2.3 


allows us to 


use binomial distribution to bound quantities like P(A^, ^ < x) with a “small error term” caused 


by the rare event Formally, 


< 2EA^/A(7r) < A) < P( < 2EA^/A(7r) < A) + P( ^^.,„|A(7r) < A 


< 


< 2EA^/|A(7r) < a) + , 


where the last inequality follows from Lemma 2.4 Therefore, by Lemma 2.3, we get that 
< 2EA'^/|A(7r) < a) < p(Bin((5n/2^, c) < 2EA^/) + 2ne"^"^'46 . 


(2.13) 


Next let us define a new event as 

^ri,5o,n = UA:>(5on UTrSUfe — 2EA^/, A(7r) < A}. 

So Er^^So,n is the event that there exists a A-light path vr with Zen(7r) > 5on and which contains 
at least 2EA^/ many downcrossings. Thus occurrence of 3.^1,So,n implies that N^i > 2'EN^r which 
has small probability owing to (2.6). On the other hand if Eri^So,n does not occur, L{n,X) > Squ 
implies the existence of a A-light path of length at least Squ that has no more than 2EA^/ many 
downcrossings. Formally,
(2.14)


P(L(A, n) > Jon) = P(E;^, 5 o,„) + P({L(A, n) > Jo?^} \ '^v,So,n) 

< F{N,, > 2EiV,,) +p(u>5o„ < 2EiV,,, Al(7r) < A}) 

< 0(1) + Ek>5on < 2EN,,\A{7r) < A)E(A(vr) < A) . 


Now choose Jq = ^oiv') such that 

5onri'cl4: = 2KN^i . 

Since 1/i > rj', we then get from Binomial concentration that for J > Jq) 

F(Bm{5n/2£,c) < 2EN'^/) < ^-^^v'cVie _ 


(2.15) 


Plugging this into (2.13) we have 

< 2EiV^/|^(7r) < a) < ^ 

whenever len{7r) > Jgn. A straightforward computation using (2.1) yields 

< A) < ^e-‘^ 

The last two displays and ( |2.14 ) together imply that 

P(i(A,n) > Son) < o(l) + . (2.16) 

Setting $\eta' = 32e\eta/c^2$ we get from (2.16),

$$\mathbb{P}(L(n,\lambda) \ge \delta_0 n) = o(1).$$

It remains to be checked whether $\delta_0$ obtained from (2.15) has the correct functional form as in (1.1). To this end recall from (2.2) that

$$2\,\mathbb{E}N_{\eta'} \le 3a\, e^{e\eta/\eta'} \sqrt{\eta'}\, e^{-e/\sqrt{\eta'}}\, n,$$

where $\eta$ is small enough so that $C_0(\eta)$ in (2.2) is less than $3/2$. Hence $\delta_0 \le e^{-C_2/\sqrt{\eta}}$ for some absolute constant $C_2$ when $\eta$ is small.

□


3 Proof of the lower bound

3.1 Existence of a large number of vertex-disjoint light paths

As we mentioned in the introduction, the proof of the lower bound is divided into two steps. In the first step we split the vertices into two parts and show that there exist a large number of short (i.e. of $O(1)$ length) vertex-disjoint $\lambda$-light paths containing vertices from only one part. In the second step we use vertices in the other part as "links" to connect a subcollection of the short paths obtained from step 1 into a long (i.e. of $\Theta(n)$ length) and light path. The current and next subsections are devoted to these two steps in respective order.



















In light of the preceding discussion, let us first select a complete subgraph $\mathcal{W}^*$ of $\mathcal{W}_n$ containing $n_* = n_{\eta,\zeta_1} = (1 - \zeta_1\eta)n$ vertices where $\eta\zeta_1 \in (0,1)$. To be specific we can order the vertices of $\mathcal{W}_n$ in some arbitrary way and define $\mathcal{W}^*$ as the subgraph induced by the "first" $n_*$ vertices. It will be shown that there are substantially many short and light paths that can be formed with the vertices in $V(\mathcal{W}^*)$. We will in fact require slightly more from a path than just being $\lambda$-light. For $\pi \in \Pi_\ell$ and some $\zeta_2 > 0$, define

$$G_\pi = \big\{\lambda\ell - 1 \le W(\pi) \le \lambda\ell,\ M(\pi) \le (\zeta_2/\sqrt{\eta}) \cdot (W(\pi)/\lambda\ell)\big\}, \tag{3.1}$$

where $M(\pi)$ is the maximum deviation of $\pi$ away from the linear interpolation between the starting and ending edges, formally given by

$$M(\pi) = \sup_{1 \le k \le \ell} \Big| \sum_{i=1}^{k} \big(W_{e_i} - A(\pi)\big) \Big|, \tag{3.2}$$

with $e_1, \ldots, e_\ell$ denoting the successive edges of $\pi$.


A similar class of events was considered in [3, 9] for the purpose of second moment computations. As the authors mentioned in these papers, the factor $W(\pi)/\lambda\ell$ provides some technical ease in view of the following property, which is a consequence of Lemma 2.1:

$$\mathbb{P}\big(M(\pi) \le (\zeta_2/\sqrt{\eta}) \cdot (W(\pi)/\lambda\ell) \,\big|\, W(\pi) = w\big) = \text{constant for all } w > 0. \tag{3.3}$$


Call a path $\pi \in \Pi_\ell$ good if $G_\pi$ occurs. Since we are only interested in good paths whose vertices come from $V(\mathcal{W}^*)$, we need some related notations. For $\ell \in \mathbb{N}$, denote by $\Pi^*_\ell$ the set of all paths of length $\ell$ in $\mathcal{W}^*$ and by $N^*_\ell$ the total number of good paths in $\Pi^*_\ell$, i.e., $N^*_\ell = \sum_{\pi \in \Pi^*_\ell} \mathbf{1}_{G_\pi}$. In order to carry out second moment analysis of $N^*_\ell$ we need to control the correlation between $\mathbf{1}_{G_\pi}$ and $\mathbf{1}_{G_{\pi'}}$, where $\pi, \pi' \in \Pi^*_\ell$. It is plausible that such correlation depends on the number of common edges between $\pi$ and $\pi'$ and in fact bounding the correlation in terms of the number of common edges was sufficient for proving (2.5) in Section 2. But in this case we need an additional measurement instead of just $|E(\pi) \cap E(\pi')|$. This is discussed in detail in [3, 9] and some of their results will be used. Let $\pi$ be a path in $\Pi^*_\ell$ and $S \subseteq E(\pi)$. A segment of $\pi$ is called an $S$-component or a component of $S$ if it is a maximal segment of $\pi$ all of whose edges belong to $S$. Notice that $S$-components can be defined solely in terms of $S$. For two paths $\pi$ and $\pi'$, define a functional $\theta(\pi,\pi')$ to be the number of $S$-components where $S = E(\pi) \cap E(\pi')$. As $\pi$ and $\pi'$ are self-avoiding, $\theta(\pi,\pi')$ is basically the number of maximal segments shared between $\pi$ and $\pi'$. We refer the readers to Figure 1 for an illustration.

The following result ([3, Lemma 2.9]) relates the cardinality of $V(S)$, the union of all endpoints of edges in $S = E(\pi) \cap E(\pi')$, to $\theta(\pi,\pi')$ and $|S|$:


$$|V(S)| = |S| + \theta(\pi,\pi').$$
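The definitions of $S$-components and $\theta(\pi,\pi')$ can be made concrete with a short sketch. The example below, in the spirit of Figure 1, uses integer vertex labels chosen for illustration; the function names `edge_set` and `shared_components` are hypothetical.

```python
def edge_set(path):
    """Undirected edges of a path given as a tuple of distinct vertices."""
    return {frozenset(e) for e in zip(path, path[1:])}

def shared_components(pi, pi_prime):
    """Maximal segments of pi all of whose edges also lie on pi_prime,
    i.e. the S-components for S = E(pi) & E(pi_prime)."""
    common = edge_set(pi) & edge_set(pi_prime)
    components, current = [], []
    for u, v in zip(pi, pi[1:]):
        if frozenset((u, v)) in common:
            # Extend the running segment, or start a new one at u.
            current = current + [v] if current else [u, v]
        else:
            if current:
                components.append(tuple(current))
            current = []
    if current:
        components.append(tuple(current))
    return components

# Two paths sharing the segments (3, 4) and (6, 7, 8).
pi = (1, 2, 3, 4, 5, 6, 7, 8, 9)
pi_prime = (11, 12, 3, 4, 15, 6, 7, 8, 19)
comps = shared_components(pi, pi_prime)
S = edge_set(pi) & edge_set(pi_prime)
theta = len(comps)                      # number of S-components
V_S = set().union(*S) if S else set()   # endpoints of edges in S
```

Here $|S| = 3$ and $\theta = 2$, so the identity gives $|V(S)| = 5$, matching the five shared vertices 3, 4, 6, 7, 8.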

The pair $(\theta(\pi,\pi'), |E(\pi) \cap E(\pi')|)$ turns out to be sufficient for bounding the correlation between $\mathbf{1}_{G_\pi}$ and $\mathbf{1}_{G_{\pi'}}$ from above. Consequently it makes sense to partition $\Pi^*_\ell$ based on the value of this pair. More formally for $\pi \in \Pi^*_\ell$ and integers $i \le j$, define the set $A_{i,j}$ as

$$A_{i,j} = A_{i,j}(\pi) = \{\pi' \in \Pi^*_\ell : \theta(\pi,\pi') = i,\ |E(\pi) \cap E(\pi')| = j\}. \tag{3.4}$$

We need a number of lemmas from [3].

Lemma 3.1. Lemma 2.10]) For any 1 < £ < n* and any vr G we have that for any positive 
integers i < j 

|Al,,(vr)| < ('+/) T(^ + 1 - j)! <

Figure 1 - Components of the set of edges common to two paths. In this figure the sequences of vertices $v_1, v_2, v_3, v_4, v_5, v_6, v_7, v_8, v_9$ and $v'_1, v'_2, v_3, v_4, v'_5, v_6, v_7, v_8, v'_9$ define the paths $\pi$ and $\pi'$ respectively. The dark edges belong to $S = E(\pi) \cap E(\pi')$. Here $\theta(\pi,\pi') = 2$ with the segments $(v_3, v_4)$ and $(v_6, v_7, v_8)$ being the two $S$-components.


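Lemma 3.2. ([3, Lemma 2.3]) Let $Z_i$ be i.i.d. exponential variables with mean $\theta > 0$ for $1 \le i \le \ell$. For $1/4 \le \rho \le 4$, consider the variable

$$M_\ell = \sup_{1 \le k \le \ell} \Big| \sum_{i=1}^{k} Z_i - \rho k \Big|. \tag{3.5}$$

Then there exist absolute constants $c^*, C^* > 0$ such that for all $r \ge 1$ and $\ell \ge r^2$,

$$e^{-C^*\ell/r^2} \le \mathbb{P}\Big(M_\ell \le r \,\Big|\, \sum_{i=1}^{\ell} Z_i = \rho\ell\Big) \le e^{-c^*\ell/r^2}.$$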

Lemma 3.3. Q Lemma 3.2] Let Zi he i.i.d. exponential variables with mean 9 > 0 for i G N. 
Consider 1 < r < \// and the integer intervals [oi, 61 ], [ 02 , 62 ], •'' > ^m] such that 1 < oi < 61 < 

a 2 <■■■< am ^ bm < £ and q = — ai + 1) < £ — 1. Let 1/4 < p < 1 and Mi be defined as 

in the previous lemma. Also write A = n N and pi = F{Mi < r \ = P^)- Then 

for all Zj such that 

- p{bi - a* + 1 ) < 2r , 

we have 


< = P^-, Zj = ^3 for all 3 ^ A) < Csr-s/q A {£ - 


(3.6) 


where C* is the constant from Lemma\3f^ and 6*3 > 0 is an absolute constant. 


Remark 3.4. (1) Notice that the bounds in Lemma 3.2 and 3.3 do not depend on the particular 


mean of Zfs due to Lemma 2.1 (2) Although the bounds on pi in Lemma 3.2 do not contain any 


p (as it was restricted to a bounded interval), pi actually depends on r only through the ratio r/p. 


IS same 


This follows from an application of Lemma 2.1 with little manipulation. (3) Lemma 3.3 
as Lemma 3.2. in |1] except that in the latter q is restricted to be at most £ — lOr. But we can 
easily extend this to all g < .£ — 1. To see this assume £ — 1 > q > £ — l Or. Then the right hand 

we get pie^*^A^ > 1. So 


side in (3.6) becomes at least CzPicP e A _ Now from Lemma 


3.2 


the right hand side in (3.6) is bigger than whenever £ — 1 > q > £ — lOr. Increasing C 3 if 


necessary we can make this number bigger than 1 and thus Lemma 3.3 follows.

By second moment computation, we can hope to show that $N^*_\ell \sim \mathbb{E}N^*_\ell$ with high probability. Then the main challenge is to prove that a large fraction of the good paths are mutually vertex-disjoint with high probability. To this end, we consider a graph $\mathcal{G}_n$ where each vertex corresponds to a good path in $\Pi^*_\ell$ and an edge is present whenever the corresponding paths intersect at one vertex at least. Thus the presence of a large number of vertex-disjoint good paths in $\mathcal{W}^*$ is equivalent to the existence of a large independent subset (i.e., a subset that has no edge among them) in the graph $\mathcal{G}_n$. The following simple lemma is sometimes referred to as Turán's theorem, and can be
proved simply by employing a greedy algorithm (see, e.g., 0 ). 


Lemma 3.5. Let $G = (V, E)$ be a finite, simple graph with $V \ne \emptyset$. Then $G$ contains an independent subset of size at least $|V|^2/(2|E| + |V|)$. Notice that $2|E|$ is the total degree of vertices in $G$.
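One greedy rule that attains this bound is to pick a vertex of minimum remaining degree, delete it together with its neighbours, and repeat. This choice is an assumption for illustration, since the text does not specify which greedy rule is meant. A minimal sketch with hypothetical function names:

```python
def greedy_independent_set(vertices, edges):
    """Minimum-degree greedy: repeatedly pick a vertex of smallest remaining
    degree, then delete it and its neighbours. Vertices must be comparable
    (ties are broken by vertex label)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(vertices)
    chosen = []
    while alive:
        v = min(alive, key=lambda x: (len(adj[x] & alive), x))
        chosen.append(v)
        alive -= adj[v] | {v}
    return chosen

def turan_bound(num_vertices, num_edges):
    """The lower bound |V|^2 / (2|E| + |V|) on the independence number."""
    return num_vertices ** 2 / (2 * num_edges + num_vertices)

# 5-cycle: the bound is 25/15, so an independent set of size 2 must exist.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cycle_set = greedy_independent_set(range(5), cycle_edges)
```

In the application above, the vertices are good paths and the edges record pairs of intersecting paths, so a small total degree yields many mutually vertex-disjoint good paths.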


In light of Lemma 3.5 we wish to show that with high probability the total degree of vertices in $\mathcal{G}_n$ is not big relative to $|V(\mathcal{G}_n)|$. For this purpose, it is desirable to show that the typical number of good paths that intersect with a fixed good path $\pi \in \Pi^*_\ell$ is not big. Thus, we need to estimate $\sum_{\pi' \in \Pi^*_{\ell,\pi}} \mathbb{P}(G_{\pi'} \mid G_\pi)$ where $\Pi^*_{\ell,\pi}$ is the collection of all paths $\pi'$ in $\mathcal{W}^*$ sharing at least one vertex with $\pi$. Drawing upon the discussions preceding (3.4), we will first estimate $\mathbb{P}(G_{\pi'} \mid G_\pi)$ for a specific value of the pair $(\theta(\pi,\pi'), |E(\pi) \cap E(\pi')|)$. Our next lemma is very similar to Lemma 3.3 in [4].


Lemma 3.6. Let vr G n| and tt' G Aij with 1 < i < j < i- Then there exist absolute constants 
r/i, 6*4 > 0 such that for 0 < 7 y<??i, C 2 > 1 V ^2G* je and i> (, 2/11 we have 


F{G^,\G^) < C 4 (l + (3 7) 


Proof. Denote by S and S' the sets E{'k) n E{tt') and E{'k') \ E{tt) respectively. By standard 
calculus, there exists 0 < r/i < 1 such that 1 + ery > for all 0 < r] < rji. Note that 

I Gn) = Pi • p 2 , where 


pi = ¥{Xi - 1 < IT(7r') < A£ | G ^), 

P 2 = E(M(7r') < (C2/V^).(IF(7r')/A^) | A£ - 1 < IT(7r') < A£) . 

Since the maximum deviation of a good path from its linear interpolation between starting and 
ending edges is at most the weight of an S'-component, say s, is at least W{Tr)\s\/£ — 2(2/-fiil 

when vr is good. Here |s| denotes the number of edges in s. Adding over all the 9{Tr, vr') components 
of S we get that — 20(7r, 7r')((2/v^ on Gt^. As vr' G Aij and weight of a good 

path is at least A^ — 1 , the previous inequality implies that on G^^, 


> Aj - 1 - 2iC2l^fin- 

Consequently when j < £ — 1, 

Pi < F(J2eeS'We<MS'\ + l + 2iC2/Vv\G^) (3.8) 

= E(Gamma(t' — j, 1/n) < A{£ — j) + 1 + 2iC2/\A/) 

< C'4n-^^-^\£-j)-^/\i + er]Y-j^2ieC2/Vv{i+ev)^ ( 3 , 9 ) 


where 6*4 > 0 is an absolute constant and the last inequality used ( 2 . 1 ). 
the right hand side of (|3.8l), we can apply (13.3^ and Lemma |3.3|to obtain 


For the second term in 


p 2 < G3E(M(7r) < C2/Vh I = A/) Vj A {£ - ^ ( 3 . 10 )
when j < I — 1 and £ > Ci/v conditions in Lemma 3.3). Using (3.3) again, we get that


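$$\begin{aligned}
\mathbb{P}(M(\pi) \le \zeta_2/\sqrt{\eta} \mid W(\pi) = \lambda\ell) &= \mathbb{P}\big(M(\pi) \le (\zeta_2/\sqrt{\eta})\cdot(W(\pi)/\lambda\ell) \mid \lambda\ell - 1 \le W(\pi) \le \lambda\ell\big)\\
&= \mathbb{P}(G_\pi)/\mathbb{P}(\lambda\ell - 1 \le W(\pi) \le \lambda\ell)\\
&= \mathbb{P}(G_\pi)/\mathbb{P}(\lambda\ell - 1 \le \mathrm{Gamma}(\ell, 1/n) \le \lambda\ell)\\
&\le C_4''(1 + o(1))\,\mathbb{P}(G_\pi)\,\ell!\,(n/\lambda\ell)^{\ell},
\end{aligned}$$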

where C 4 > 0 is an absolute constant and the last inequality follows from Plugging the 

preceding inequality into (3.10) and using the fact £l < e\/£(t'/e)^ (Stirling’s approximation) 

p2 < eC3C^'(l + o(l))P(G^)nV'^(^-i)/^(l + er?)-^10^°°*^2/V»?gCLVC| . 


Combined with (3.9), it yields that 

P(G'^'IG^) < eC3C^C^'C2(l + o(l))P(G^)v^n^'(l + er/)-no^™*^2/V»igC*i7?/c|g2i<2/V^?(i+e^) ^ 

Since $\zeta_2 \ge \sqrt{2C^*/e}$ and $\eta < \eta_1$ we have

F{G^'\G^) < + 

provided $j \le \ell - 1$. The case $j = \ell$ can also be easily accommodated. To this end let us first
compute $\mathbb{P}(G_\pi)$. It follows from (2.1) and Lemma 3.2 that

IP(G^) > (1 + o(l))(l - e-^/^)(A£/n)^(l/£!)e-^‘^'^/«2 . 

Applying Stirling's formula again, we get that for $\zeta_2 \ge \sqrt{2C^*/e}$ and $\eta < \eta_1$,

IP(G^) > G'^'il + oil))n-^£-^/^e^'^ , 

for an absolute constant $C_4''' > 0$. Hence, with the choice of $C_4 = 1/C_4''' \vee eC_3C_4'C_4''$ the right hand
side of (3.7) is at least 1, and thus (3.7) holds in this case.


□ 


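Armed with Lemma 3.6 we can now obtain an upper bound on $\sum_{\pi' \in \Pi^*_{\ell,\pi}} \mathbb{P}(G_{\pi'} \mid G_\pi)$. Similarly we can bound $\sum_{\pi' \in \Pi^*_\ell} \mathbb{P}(G_{\pi'} \mid G_\pi)$, which is useful for the computation of $\mathbb{E}((N^*_\ell)^2)$ in view of the following simple observation:

$$\mathbb{E}((N^*_\ell)^2) = \sum_{\pi, \pi' \in \Pi^*_\ell} \mathbb{P}(G_\pi \cap G_{\pi'}) = |\Pi^*_\ell| \sum_{\pi' \in \Pi^*_\ell} \mathbb{P}(G_\pi \cap G_{\pi'}), \qquad (3.11)$$

where the last equality follows from the fact that $\sum_{\pi' \in \Pi^*_\ell} \mathbb{P}(G_{\pi'} \mid G_\pi)$ is independent of $\pi$.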


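Lemma 3.7. Let $0 < \zeta_1 < 1/4$ and let $\zeta_2, \ell, \eta$ satisfy the same conditions as stated in Lemma 3.6.
Then there exists an absolute constant $C_5 > 0$ such that,

$$\sum_{\pi' \in \Pi^*_{\ell,\pi}} \mathbb{P}(G_{\pi'} \mid G_\pi) \le C_5(1 + o(1))\, e^{1000\zeta_2/\sqrt{\eta}} \sqrt{\ell/\eta^3}, \qquad (3.12)$$

$$\sum_{\pi' \in \Pi^*_\ell} \mathbb{P}(G_{\pi'} \mid G_\pi) \le (1 + o(1))\,\mathbb{E}N^*_\ell. \qquad (3.13)$$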

Proof. By Lemmas 3.6 and 3.1 we get that for $1 \le i \le j \le \ell$,

E.'^a,MG.^G^) < + < (l + o(l))EAr; gfa^;^^-’^^) , (3.14) 
where ^{r], i,j, Ci) is a number depending only on ( 77 , i, i,j, ^i) (so in particular, ^( 77 , i, j, Ci) does
not depend on n). It is also clear that 




Combined with (3.14), it yields (3.13). It remains to prove (3.12). To this end, we note that the major contribution to the term $\sum_{\pi' \in \Pi^*_{\ell,\pi}} \mathbb{P}(G_{\pi'} \mid G_\pi)$ comes from those paths $\pi'$ with $\theta(\pi,\pi') = 1$ or $|V(\pi') \cap V(\pi)| = 1$. Thus, we revisit (3.14) for the case of $i = 1$. By Lemmas 3.6 and 3.1 again, we
get that

Ei<j<f Eaij nOAG.) < 2^4(1 + 

< 2^4(1 + - e"i))“^ 

< 8 ^ 4(1 + o(l))e^°“^2/V^^£7/^3Eiv;77-^, (3.15) 

where the last two inequalities follow from the facts that Ci < 1/4 and e“^/^ < 1 — 77/4 whenever 
$0 < \eta < 1$. We still need to consider paths that share vertices with $\pi$ but no edges. For $1 \le i \le \ell + 1$,
define $B_i$ to be the collection of paths which share $i$ vertices with $\pi$ but no edges, i.e.,

$$B_i = \{\pi' \in \Pi^*_\ell : |V(\pi') \cap V(\pi)| = i,\ E(\pi') \cap E(\pi) = \emptyset\}.$$

We need an upper bound on the size of $B_i$. To this end notice that there are $\binom{\ell+1}{i}$ many choices for $V(\pi') \cap V(\pi)$ as the cardinality of the latter is $i$, and these vertices can be placed along $\pi'$ in at most $\binom{\ell+1}{i}\, i!$ many different ways. Also the number of ways we can choose the remaining $\ell + 1 - i$ vertices is at most $n^{\ell+1-i}$. Multiplying these numbers we get $|B_i| \le \binom{\ell+1}{i}^2 i!\, n^{\ell+1-i}$.

Since the edge sets are disjoint, $\mathbb{P}(G_{\pi'} \mid G_\pi) = \mathbb{P}(G_\pi)$ for all $\pi' \in B_i$ and $1 \le i \le \ell + 1$. So we have

E.^6 bP(G./G.) < (l + o(l))(Y)'i!(l-Ci^?)"*^ < (8 + o(l))f’2^. (3.16) 

Combined with (3.15), it completes the proof of (3.12). □ 


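We will now proceed with our plan of finding a large independent subset of $\mathcal{G}_n$. For any two
paths $\pi$ and $\pi'$ in $\Pi^*_\ell$, define an event

$$G_{\pi,\pi';\eta,\zeta_2} = \begin{cases} G_\pi \cap G_{\pi'} & \text{if } V(\pi) \cap V(\pi') \ne \emptyset,\\ \emptyset & \text{otherwise.} \end{cases}$$

Writing $N'_\ell = \sum_{\pi,\pi' \in \Pi^*_\ell} \mathbf{1}_{G_{\pi,\pi';\eta,\zeta_2}}$, we see that $N'_\ell = 2|E(\mathcal{G}_n)| + |V(\mathcal{G}_n)|$. Also notice that $N^*_\ell = |V(\mathcal{G}_n)|$. As an immediate consequence of Lemma 3.7, we can compute an upper bound of
$\mathbb{E}N'_\ell$ as follows:

$$\mathbb{E}N'_\ell \le C_5(1 + o(1))\, e^{1000\zeta_2/\sqrt{\eta}} \sqrt{\ell/\eta^3}\; \mathbb{E}N^*_\ell. \qquad (3.17)$$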

If $N'_\ell$ and $N^*_\ell$ are concentrated around their respective means in the sense that $N'_\ell = \mathbb{E}N'_\ell(1 + o(1))$
and $N^*_\ell = \mathbb{E}N^*_\ell(1 + o(1))$ with high probability, then we can use Lemma 3.5 and (3.17) to derive a
lower bound on the size of a maximum independent subset of $\mathcal{G}_n$. For this purpose, it suffices to
show that $\mathbb{E}((N^*_\ell)^2) = (\mathbb{E}N^*_\ell)^2(1 + o(1))$ and $\mathbb{E}((N'_\ell)^2) = (\mathbb{E}N'_\ell)^2(1 + o(1))$. The former has already
been addressed by (3.13) (see (3.11)). For the latter we need to estimate contributions from terms
like $\mathbb{P}(G_{\pi_1,\pi_2} \cap G_{\pi_3,\pi_4})$ in the second moment calculation for $N'_\ell$. Our next lemma will be useful for
this purpose.
Lemma 3.8. Let $\pi_1, \pi_2, \pi_3, \pi_4$ be paths in $\Pi^*_\ell$ such that $|E(\pi_3 \cup \pi_4)| = 2\ell - j$ and $|E(\pi_1 \cup \pi_2) \cap E(\pi_3 \cup \pi_4)| = j'$ where $0 \le j \le \ell$ and $1 \le j' \le 2\ell - j$. Also assume that $V(\pi_3 \cap \pi_4) \ne \emptyset$. Then,

$$|V(\pi_3) \cap V(\pi_4)| + |V(\pi_3 \cup \pi_4) \cap V(\pi_1 \cup \pi_2)| \ge j + j' + 2. \qquad (3.18)$$


[Figure 2: two panels showing the union of the paths $\pi_3$ and $\pi_4$, before (top) and after (bottom) the edge removal. Legend: $E(\pi_3 \cap \pi_4)$; $E(\pi_4) \setminus E(\pi_3)$; $E(\pi_3) \setminus E(\pi_4)$.]

Figure 2 - Removing edges from union of two paths. In these figures the sequences
of vertices $v_1, v_2, v_3, v_4, v_5, v_6, v_7, v_8, v_9$ and $v_1', v_2, v_3, v_4, v_5', v_6, v_7, v_8, v_9$ define the paths
$\pi_4$ and $\pi_3$ respectively. $C_1^{\circ}$ and $C_2^{\circ}$ are the two connected components of $\pi_3 \cap \pi_4$. In
the figure at the top, the vertices $v_4, v_5, v_6, v_5'$ define a cycle. After removing the edge
$(v_4, v_5)$ from the only segment in $E(\pi_4) \setminus E(\pi_3)$ between $C_1^{\circ}$ and $C_2^{\circ}$, we get an acyclic
graph displayed at the bottom.


Proof. Suppose the graph $\pi_3 \cap \pi_4$ has exactly $k + 1$ (connected) components, namely $C_1^{\circ}, \cdots, C_{k+1}^{\circ}$.
Notice that $k$ is nonnegative as $\pi_3 \cap \pi_4 \ne \emptyset$. Since $|E(\pi_3 \cap \pi_4)| = j$ and $\pi_3 \cap \pi_4$ is acyclic with $k + 1$
components, we have that $|V(\pi_3 \cap \pi_4)| = j + k + 1$. Now suppose it were shown that $\pi_3 \cup \pi_4$ can
be made acyclic by removing at most $k$ edges while keeping the vertex set the same, and call this new
graph $H$. One would then have,

$$|V(H \cap (\pi_1 \cup \pi_2))| \ge |E(H \cap (\pi_1 \cup \pi_2))| + 1 \ge |E((\pi_3 \cup \pi_4) \cap (\pi_1 \cup \pi_2))| - k + 1 = j' - k + 1.$$
Adding this to $|V(\pi_3 \cap \pi_4)| = j + k + 1$ would immediately give (3.18). In the remaining part of this
proof we will show that one can remove $k$ edges from $\pi_3 \cup \pi_4$ so that the resulting graph becomes
acyclic.

Let $C$ be a cycle in $\pi_3 \cup \pi_4$. Since $\pi_3$ and $\pi_4$ are acyclic, $C$ consists of an alternating sequence
of segments in $E(\pi_3) \setminus E(\pi_4)$ and $E(\pi_4) \setminus E(\pi_3)$ interspersed with segments in any one of the $C_i^{\circ}$'s
(possibly trivial, i.e. consisting of a single vertex). This implies that for some $1 \le i, i' \le k + 1$,
$C$ contains a (nontrivial, i.e. of positive length) segment in $E(\pi_4) \setminus E(\pi_3)$ joining $C_i^{\circ}$ and $C_{i'}^{\circ}$. In
fact $i \ne i'$ since $\pi_4$ is acyclic. Hence the only case we need to consider is when $k \ge 1$. As $\pi_4$
is a path, $C_1^{\circ}, \cdots, C_{k+1}^{\circ}$ are vertex-disjoint segments (possibly trivial) aligned along $\pi_4$ in
some order with $k$ intervening (nontrivial) segments in $E(\pi_4) \setminus E(\pi_3)$. Pick one edge from each of
these $k$ segments. It follows from the discussions so far that $C$ must contain one of these edges.
Consequently removing these $k$ edges from $\pi_3 \cup \pi_4$ would make the resulting graph acyclic. We
refer the readers to Figure 2 for an illustration. □
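The edge-removal step in the proof above can be checked mechanically on small examples. The following is a minimal sketch, not part of the original argument: the function names `break_cycles` and `is_acyclic` and the vertex labels are illustrative assumptions. It walks along $\pi_4$, removes the edge of $\pi_4$ leaving each component of $\pi_3 \cap \pi_4$ except the last one, and then tests acyclicity with a union-find pass.

```python
def break_cycles(p3, p4):
    """Return (removed, remaining) edge sets for the union of two paths.

    p3, p4: lists of distinct vertices (two self-avoiding paths).
    For every component of p3 ∩ p4 along p4 except the last, the edge of p4
    leaving that component is removed (k removals for k + 1 components).
    """
    edge = lambda u, v: frozenset((u, v))
    e3 = {edge(a, b) for a, b in zip(p3, p3[1:])}
    e4 = {edge(a, b) for a, b in zip(p4, p4[1:])}
    v3 = set(p3)
    shared = [t for t, v in enumerate(p4) if v in v3]  # positions on p4
    removed = set()
    for a, b in zip(shared, shared[1:]):
        same_component = (b == a + 1) and edge(p4[a], p4[b]) in e3
        if not same_component:
            removed.add(edge(p4[a], p4[a + 1]))  # an edge in E(p4) \ E(p3)
    return removed, (e3 | e4) - removed


def is_acyclic(edges):
    """Union-find cycle check on an undirected edge set."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for e in edges:
        u, v = tuple(e)
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


# A configuration in the spirit of Figure 2 (labels are illustrative).
p4 = ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9"]
p3 = ["v1'", "v2", "v3", "v4", "v5'", "v6", "v7", "v8", "v9"]
removed, remaining = break_cycles(p3, p4)
```

Here the intersection has two components, so exactly one edge is removed, and `is_acyclic(remaining)` holds while the full union still contains the cycle through `v4`, `v5`, `v6`, `v5'`.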


We will now use (3.18) and Lemma 3.7 to show that $N^*_\ell$ and $N'_\ell$ concentrate around their
expected values.


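Lemma 3.9. Assume the same conditions on $\zeta_1, \zeta_2, \ell$ and $\eta$ as in Lemma 3.1. Then there exists
$g_{\ell,\eta} = g_{\ell,\eta;\zeta_1,\zeta_2} : \mathbb{N} \to [0, \infty)$ depending on $\ell, \eta$ (and $\zeta_1, \zeta_2$) with $g_{\ell,\eta}(n) \to 0$ as $n \to \infty$ such that the
following hold:

(1) $\mathbb{P}(|N^*_\ell - \mathbb{E}N^*_\ell| \le g_{\ell,\eta}(n)\,\mathbb{E}N^*_\ell) \to 1$ as $n \to \infty$;

(2) $\mathbb{P}(|N'_\ell - \mathbb{E}N'_\ell| \le g_{\ell,\eta}(n)\,\mathbb{E}N'_\ell) \to 1$ as $n \to \infty$.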


Proof. The proof of (1) is rather straightforward. By (3.11) and (3.13) we see that

$$\mathbb{E}((N^*_\ell)^2) \le (\mathbb{E}N^*_\ell)^2(1 + o(1)).$$

An application of Markov's inequality then yields Part (1). In order to prove Part (2), we first
argue that $\mathbb{E}N'_\ell = \Theta(n)$. Similar to the computation of (2.2), we can show that $\mathbb{E}N^*_\ell$ is $O(n)$. But
then (3.17) tells us that the same is also true for $\mathbb{E}N'_\ell$. For the lower bound, notice that given any path
$\pi_1$ in $\Pi^*_\ell$, there are $\Theta(n^{\ell})$ many paths in $\Pi^*_\ell$ that intersect $\pi_1$ in exactly one vertex. Furthermore
for any such pair $(\pi_1, \pi_2)$ we have

$$\mathbb{P}(G_{\pi_1,\pi_2}) = (\mathbb{P}(G_\pi))^2 = \Theta(n^{-2\ell}),$$

where the last equality follows from (2.1) (see the computation in (2.2)) and Lemma 3.2. Therefore,
we obtain that

$$\mathbb{E}N'_\ell = \Theta(n^{\ell+1}) \sum_{\pi_2 \in \Pi^*_\ell} \mathbb{P}(G_{\pi_1,\pi_2}) \ge \Theta(n^{\ell+1})\,\Theta(n^{\ell})\,\Theta(n^{-2\ell}) = \Theta(n).$$


Next we estimate E((A£)^). For this purpose, we hrst consider two fixed '7ri,7r2 G H^ such that 
P(7ri) n P(7r2) 7 ^ 0. For 0 < J < £ and 1 < / < 2.^ — J, let H^f^ be the collection of all pairs 
of paths (vr3,7r4) G H^ such that |£'(7ri U 7r2) n E(7r3 U 7r4)| = j' and |A(7r3 U 7r4)| = 2£ — j. For 
(7r3,7r4) G we see that | A(7r3 U 7r4) \ A(7ri U 7r2) | =21 — j — j' and thus by a similar reasoning 

as employed in (2.3) we get 


Now let n^’i ,i 2 (’^i) ■'^ 2 ) ^ iilf ;^ contain all the pairs (7r3,7r4) such that |P(vr3)nlL(7r4)| = ni > 1 and 
|l/(7r3 U 7r4) n V (vTi U 7r2)| = n 2 . Then |P(7r3 U 7r4) \ V (tti U 7r2)| = 2.^ -|- 2 — ni — 712 and consequently 
(7r3,7r4) e nlflij ( 711 , 77 - 2 ). Therefore,


By 


Lemma 


3.8 


we know that for 771+772 > J + / + 2 for 


~ ^l<m.n2<2£+2E(^3^^4)gn"^J+ (ni,n2)'^*^^^3’’"4|.f^7ri,7r2) “ 0 ( 1 ) ■ 

This implies that 

772),(7r3,7r4)^(.^7ri,71-2 .^773,774) = 0 ( 1 )EA/£ . 

where the sum is over all such pairs such that |£^(vri U 712 ) H E{'k^ U 774 )! 7^ 1^ addition, 

^(771,772),(773,774)^(-^771,772 LI = (l + o(l))(EA^£) . 

where the sum is over all such pairs such that |i?(7ri U 712 ) H E^tt^ + 774 )! = 0 (thus in this case 
i/(^i 772 ) is independent of Combined with the fact that EA^^ = ©(n), it gives that 

$\mathbb{E}((N'_\ell)^2) = (1 + o(1))(\mathbb{E}N'_\ell)^2$. At this point, another application of Markov's inequality completes
the proof of the lemma. □


We are now well-equipped to prove the main lemma of this subsection. For convenience of
notation, write

$$f(\ell, \eta) = f_{\zeta_2}(\ell, \eta) = e^{-1000\zeta_2/\sqrt{\eta}} \sqrt{\eta^3/\ell}. \qquad (3.19)$$


Lemma 3.10. Assume the same conditions on $\zeta_1, \zeta_2, \ell$ and $\eta$ as in Lemma 3.1. Let $S_{n,\eta,\ell} = S_{n,\eta,\ell;\zeta_1,\zeta_2}$ be a subset of maximum cardinality among all subsets of $\Pi^*_\ell$ containing only pairwise
disjoint good paths. Then there exists an absolute constant $C_6 > 0$ such that,

$$\mathbb{P}(|S_{n,\eta,\ell}| \ge C_6 f(\ell, \eta)\, n) \to 1 \text{ as } n \to \infty. \qquad (3.20)$$


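Proof. Let $h(\ell, \eta) = C_5/f(\ell, \eta) = C_5\, e^{1000\zeta_2/\sqrt{\eta}} \sqrt{\ell/\eta^3}$. By Lemma 3.9 and (3.17), we assume without loss of
generality that

$$|N^*_\ell - \mathbb{E}N^*_\ell| \le g_{\ell,\eta}(n)\,\mathbb{E}N^*_\ell \quad\text{and}\quad N'_\ell \le (1 + o(1))\, h(\ell, \eta)\,\mathbb{E}N^*_\ell\,(1 + g_{\ell,\eta}(n)),$$

where $g_{\ell,\eta}(n)$ is defined as in Lemma 3.9. Since $N'_\ell = 2|E(\mathcal{G}_n)| + |V(\mathcal{G}_n)|$, by Lemma 3.5 we get
that the graph $\mathcal{G}_n$ has an independent subset of size at least

$$(N^*_\ell)^2/N'_\ell \ge n(1 + o(1))/h(\ell, \eta).$$

Therefore, with high probability $|S_{n,\eta,\ell}| \ge n/2h(\ell, \eta)$, which leads to (3.20) for $C_6 = 1/2C_5$. □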
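The bound used in the proof above has the form $|V|^2/(2|E| + |V|)$ for the size of an independent set. As an illustration only, and assuming this is the Turán-type bound meant by Lemma 3.5 (the lemma itself is not restated here), the sketch below uses the classical minimum-degree greedy procedure, which is known to produce an independent set of at least that size. The function name `greedy_independent_set` and the adjacency-dictionary input format are assumptions made for this example.

```python
import random


def greedy_independent_set(adj):
    """Min-degree greedy: pick a vertex of minimum degree, delete it and
    its neighbours, repeat. `adj` maps each vertex to a set of neighbours."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    chosen = []
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))
        chosen.append(v)
        removed = adj[v] | {v}
        for u in removed:
            adj.pop(u, None)
        for nb in adj.values():
            nb.difference_update(removed)
    return chosen


def turan_bound(adj):
    """The quantity |V|^2 / (2|E| + |V|) for an undirected graph."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    return n * n / (2 * m + n)


def random_graph(n, p, rng):
    """Erdos-Renyi style random graph, used only to exercise the bound."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj
```

In the setting of the proof, the vertices would be the good paths and two good paths would be adjacent when they share a vertex, so that $|V| = N^*_\ell$ and $2|E| + |V| = N'_\ell$.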


3.2 Connecting short light paths into a long one 

We set $\zeta_1 = 1/5$ and $\zeta_2 = 1 + \sqrt{2C^*/e}$ in this subsection. Note that this choice satisfies the
conditions in Lemma 3.10. Denote by $\mathcal{E}_{n,\eta,\ell}$ the event $\{|S_{n,\eta,\ell}| \ge C_6 f(\ell, \eta)\, n\}$.

The remaining part of our scheme is to connect a fraction of these disjoint good paths in a suitable
way to form a light and long path $\gamma$. In order to describe our algorithm for the construction of $\gamma$,
we need a few more notations. Denote the vertex sets $V(W_n^*)$ and $V(W_n) \setminus V(W_n^*)$ by $V_1$ and $V_2$
respectively. Let $\delta > 0$ be a number and $\nu > 0$ be an integer satisfying

$$1 \le \delta n/\ell \le C_6 f(\ell, \eta)\, n \quad\text{and}\quad \delta n \nu/\ell \le |V_2|. \qquad (3.21)$$

Now label the paths in $S_{n,\eta,\ell}$ as $\pi_1, \pi_2, \ldots$ in some arbitrary way. Our aim is to build up the path
$\gamma$ in step-by-step fashion starting from $\pi_1$. In each step we will connect $\gamma$ to some $\pi_j$ by a path
of length 2 whose middle vertex is in $V_2$. These paths will be referred to as bridges. To leverage
additional flexibility we also demarcate two segments of length $\lfloor \ell/4 \rfloor$, one on each end of the paths
$\pi_j$'s, which we call end segments. These end segments will allow us to "choose" endpoints of $\pi_j$'s
while connecting them (as such, it is possible that we only keep half of the vertices of $\pi_j$ in $\gamma$). A
vertex $v$ will be said to be adjacent to a path or an edge if it is an endpoint of that path or edge. If
an edge $e$ has exactly one endpoint in $S$, we denote that endpoint by $v_{e,S}$. The following algorithm,
referred to as BRIDGE($\nu, \ell, \delta$), will construct a long path $\gamma$. See Figure 3 for an illustration.

Initialization. $\gamma = \pi_1$, $T$ is the set of all vertices which are in end segments of $\pi_j$'s for $j \ge 2$,
$M = V_2$, $P = \emptyset$ and designate an end segment of $\gamma$ as the open end $\gamma_o$. Also let $v$ be the endpoint
of $\gamma$ not in $\gamma_o$.

Now repeat the following sequence of steps $\lfloor \delta n/\ell \rfloor - 1$ times:

Step 1. Repeat $\nu$ times: find the lightest edge $e$ between $\gamma_o$ and $M$, remove $v_{e,M}$ from $M$ and
include it in $P$. These $\nu$ edges will be called predecessor edges (so at the end of this step, $|P| = \nu$).

Step 2. Find the lightest edge between $P$ and $T$. Call it $e'$. Then $v_{e',T}$ comes from an end
segment of some path in $S_{n,\eta,\ell}$, say $\pi$.

Step 3. The edge $e'$ and the unique predecessor edge adjacent to $v_{e',P}$ define a path $b$ of length
2 (so $b$ connects a vertex in $\gamma_o$ to a vertex in $\pi$). Let $w$ be the endpoint of $\pi$ not in the end segment
that $v_{e',T}$ came from. Then there is a unique path $\gamma'$ in the tree $\gamma \cup b \cup \pi$ between $v$ and $w$. Set
$\gamma = \gamma'$ and $\gamma_o$ = the end segment of $\pi$ containing $w$.

Step 4. Remove the vertices on the end segments of $\pi$ from $T$ and reset $P$ at $\emptyset$.
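The four steps above translate fairly directly into code. The following is a small simulation sketch under stated assumptions, not the construction used in the proofs: the function name `bridge`, its arguments, the lazy sampling of edge weights, and the convention that an end segment consists of the first or last $\lfloor \ell/4 \rfloor$ vertices of a path are all choices made for this illustration. Vertices are assumed to be integers, `paths` plays the role of the disjoint good paths and `spare` the role of $V_2$.

```python
import random


def bridge(paths, spare, nu, n, iterations, rng):
    """Join vertex-disjoint paths via length-2 bridges through `spare`.

    Returns the final path (list of vertices) and the total bridge weight.
    Edge weights are i.i.d. exponential with mean n, sampled on first use.
    """
    ell = len(paths[0]) - 1
    q = ell // 4                      # size of an end segment (assumes q >= 1)
    weights = {}

    def w(u, v):
        key = (u, v) if u < v else (v, u)
        if key not in weights:
            weights[key] = rng.expovariate(1.0 / n)  # mean n
        return weights[key]

    gamma = list(paths[0])            # gamma[0] is the fixed endpoint v
    open_end = gamma[-q:]             # the open end of gamma
    remaining = {j: list(p) for j, p in enumerate(paths) if j >= 1}
    M = set(spare)
    bridge_weight = 0.0
    for _ in range(iterations):
        # Step 1: nu predecessor edges from the open end into M.
        pred = {}
        for _ in range(nu):
            u, m = min(((u, m) for u in open_end for m in M),
                       key=lambda e: w(*e))
            M.remove(m)
            pred[m] = u
        # Step 2: lightest edge between P (= keys of pred) and T.
        T = [(t, j) for j, p in remaining.items() for t in p[:q] + p[-q:]]
        m, t, j = min(((m, t, j) for m in pred for t, j in T),
                      key=lambda e: w(e[0], e[1]))
        # Step 3: splice gamma, the bridge u - m - t, and the rest of the path.
        p = remaining.pop(j)          # Step 4: its end segments leave T
        if p.index(t) >= q:           # orient p so that t lies near its head
            p.reverse()
        u = pred[m]
        gamma = gamma[: gamma.index(u) + 1] + [m] + p[p.index(t):]
        open_end = gamma[-q:]
        bridge_weight += w(u, m) + w(m, t)
    return gamma, bridge_weight
```

For example, with six paths of eight edges each and forty spare vertices, `bridge(paths, spare, nu=3, n=100, iterations=5, rng=random.Random(0))` returns a simple path that keeps at least half of the edges of every input path, which is the property used later to lower bound the length of $\gamma$.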


Notice that the conditions in (3.21) ensure that we never run out of vertices in $T$ or $M$ during the first
$\lfloor \delta n/\ell \rfloor - 1$ iterations of steps 1 to 4. Thus what we described above is a valid algorithm for such
choices of $\delta$ and $\nu$. Denote the length and average weight of the path $\gamma$ generated by BRIDGE($\nu, \ell, \delta$)
as $L_{\mathrm{BRIDGE}}(\nu, \ell, \delta)$ and $A_{\mathrm{BRIDGE}}(\nu, \ell, \delta)$ respectively when $\delta, \nu, \ell$ satisfy these inequalities. For sake
of completeness we may define these quantities to be 0 and $\infty$ respectively and regard the output
path $\gamma$ as "empty" if any one of the inequalities in (3.21) fails to hold. We are now just one lemma
short of proving the lower bound in (1.1).


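Lemma 3.11. For any $0 < \eta < \eta_2$ where $\eta_2 > 0$ is an absolute constant there exist positive integers
$\nu = \nu(\eta)$, $\ell = \ell(\eta) \ge \zeta_1/\eta$ and a positive number $\delta = \delta(\eta)$ such that

$$\mathbb{P}\big(L_{\mathrm{BRIDGE}}(\nu, \ell, \delta) \ge n e^{-C_7/\sqrt{\eta}} \text{ and } A_{\mathrm{BRIDGE}}(\nu, \ell, \delta) \le 1/e + 12\eta \mid \mathcal{E}_{n,\eta,\ell}\big) \to 1$$

as $n$ tends to infinity. Here $C_7 > 0$ is an absolute constant.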

Proof. We will omit the phrase "conditioned on $\mathcal{E}_{n,\eta,\ell}$" while talking about probabilities in this
proof (barring formal expressions) although that is to be implicitly assumed. We use $\mathrm{Exp}(1/\theta)$
to denote the distribution of an exponential random variable with mean $\theta > 0$. Define $\mathcal{B}_{n,\eta,\nu,\ell,\delta}$
to be the event that the total weight of bridges does not exceed $3\ell\eta \times \lfloor \delta n/\ell \rfloor$. Notice that if any
one of the inequalities in (3.21) does not hold, $\gamma$ is "empty" and hence $\mathcal{B}_{n,\eta,\nu,\ell,\delta}$ is a sure event.


Suppose $\delta$, $\nu$ and $\ell$ are such that (3.21) is satisfied. We will first bound the average weight $A(\gamma)$
of $\gamma$ assuming that $\mathcal{B}_{n,\eta,\nu,\ell,\delta}$ occurs. Let $\ell_i$ be the length of the segment selected by the algorithm
in the $i$-th iteration. We see that its weight can be no more than $\lambda\ell_i + 2\zeta_2/\sqrt{\eta}$ since the segment
is chosen from a good path of average weight at most $\lambda$ and maximum deviation from its linear
interpolation is at most $\zeta_2/\sqrt{\eta}$ (see (3.1) as well as the proof for Lemma 3.6). Thus the total weight
of edges in $\gamma$ from the good paths is bounded by $\lambda L + \lfloor \delta n/\ell \rfloor \cdot (2\zeta_2/\sqrt{\eta})$ where $L = \sum_i \ell_i$. Adding
this to the total weight of bridges we get with probability tending to 1 as $n \to \infty$

$$W(\gamma) \le L(1/e + \eta) + \lfloor \delta n/\ell \rfloor \cdot (2\zeta_2/\sqrt{\eta}) + \lfloor \delta n/\ell \rfloor\, 3\ell\eta.$$
Figure 3 - Illustrating an iteration of BRIDGE for $\nu = 2$ and $\ell = 4$. The edges
$e'$ and $e''$ define the path $b$. So in this iteration the paths $\gamma$ and $\pi$ are shortened slightly
before being joined via $b$.


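Since the algorithm selects at least $\ell/2$ edges from each of the $\lfloor \delta n/\ell \rfloor$ good paths it connects, we
have $\ell_i \ge \ell/2$ for each $i$ and thus $L \ge \lfloor \delta n/\ell \rfloor \times \ell/2$. Therefore,

$$\begin{aligned}
A(\gamma) &\le 1/e + \eta + \lfloor \delta n/\ell \rfloor \cdot (2\zeta_2/L\sqrt{\eta}) + \lfloor \delta n/\ell \rfloor\, 3\ell\eta/L\\
&\le 1/e + \eta + 4\zeta_2/\ell\sqrt{\eta} + 6\eta.
\end{aligned}$$

If $\ell \ge \zeta_2/\eta^{3/2}$, then from the last display we can conclude $A(\gamma) \le 1/e + 12\eta$. We can assume this
restriction on $\ell$ for now. Indeed, later we will specify the value of $\ell$ and it will satisfy the condition
$\ell \ge \zeta_2/\eta^{3/2}$.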

So it remains to find positive numbers $\delta, \nu, \ell$ as functions of $\eta$ and an absolute constant $\eta_2 > 0$
such that the following three hold for all $0 < \eta < \eta_2$: (a) $\mathbb{P}(\mathcal{B}_{n,\eta,\nu,\ell,\delta} \mid \mathcal{E}_{n,\eta,\ell}) \to 1$ as $n \to \infty$, (b)
$\ell \ge (\zeta_2/\eta^{3/2}) \vee \zeta_1/\eta$ (see the statement of the lemma as well as the last paragraph) and (c) $\gamma$ has
the desired length. In the next paragraph we will find a triplet $(\delta, \nu, \ell)$ and an absolute constant
$\eta_2' > 0$ such that (a) holds for $0 < \eta < \eta_2'$. In the final paragraph we will show that our choice of
$(\delta, \nu, \ell)$ also satisfies (b) and (c) whenever $0 < \eta < \eta_2$ where $\eta_2 \le \eta_2'$ is an absolute constant.

Let us begin with the crucial observation that, at the start of each iteration the edges between
$M$ and $\gamma_o$ are still unexplored. The same is true for the edges between $P$ and $T$ at the end of Step 1
in any iteration. Consequently their weights are i.i.d. $\mathrm{Exp}(1/n)$ regardless of the outcomes from the
previous iterations. Therefore, all the bridge weights are independent of each other. Now suppose
the mean and variance of each bridge weight can be bounded above by $2\ell\eta$ and some finite number respectively, where
we emphasize that the latter does not depend on $n$. By Markov's inequality it would then follow
that $\lim_{n \to \infty} \mathbb{P}(\mathcal{B}_{n,\eta,\nu,\ell,\delta} \mid \mathcal{E}_{n,\eta,\ell}) = 1$. To that end let us consider the bridge obtained from the $m$-th
iteration where $1 \le m \le \lfloor \delta n/\ell \rfloor - 1$. Note that here we implicitly assume (3.21), but this would be
shortly shown to be implied by some other constraints involving $\delta$, $\nu$ and $\ell$. Let $e'$ be the lightest
edge between $P$, $T$ in Step 2 and $e$ be the predecessor edge adjacent to $e'$ (for this iteration). So
the bridge weight is simply $W_{e'} + W_e$. By discussions on independence at the beginning of the
proof, it follows that $W_{e'}$ and $W_e$ are independent of each other and also of the weights of bridges
already chosen. Since these weights are minima of some collections of i.i.d. exponentials, they will
be of small magnitude provided that we are minimizing over a large collection of exponentials, i.e.,
$|T|$, $|M|$ and $\nu$ are big. It follows from the description of the algorithm that at each iteration we
lose $2\lfloor \ell/4 \rfloor$ many vertices from $T$ and $\nu$ many vertices from $M$. By simple arithmetic we then get,

$$|T| \ge C_6 \lfloor \ell/4 \rfloor f(\ell, \eta)\, n \quad\text{and}\quad |M| \ge \zeta_1 \eta n/2, \qquad (3.22)$$

for all $1 \le m \le \lfloor \delta n/\ell \rfloor - 1$ provided

$$\delta \le C_6 f(\ell, \eta)\,\ell/2 \quad\text{and}\quad \nu\delta/\ell \le \zeta_1 \eta/2. \qquad (3.23)$$


Notice that these inequalities automatically imply $\delta n/\ell \le C_6 f(\ell, \eta)\, n$ and $\delta n \nu/\ell \le |V_2|$. Thus if
$\delta, \nu, \ell$ satisfy (3.23), (3.21) would also be satisfied for all large $n$ (given $\delta, \ell$). Assume for now that
(3.23) holds. Since $W_{e'}$ is the minimum of $\nu \times |T|$ many independent $\mathrm{Exp}(1/n)$ random variables,
it is distributed as $\mathrm{Exp}(\nu|T|/n)$. As for $W_e$, it is bounded by the maximum weight of the $\nu$
predecessor edges. From properties of exponential distributions and the description of the algorithm it
is not difficult to see that this maximum weight is distributed as $E_1 + E_2 + \cdots + E_\nu$ where $E_{i+1}$ is
exponential with rate $(|M| - i) \times 1/n \times \lfloor \ell/4 \rfloor$. By (3.22), we can then bound the expected weight
of the bridge from above by


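$$\frac{n}{\nu \cdot C_6 \lfloor \ell/4 \rfloor f(\ell, \eta)\, n} + \nu \times \frac{n}{(\zeta_1 \eta n/2 - \nu)\lfloor \ell/4 \rfloor} \le \frac{5}{C_6 \nu \ell f(\ell, \eta)} + \frac{11\nu}{\zeta_1 \eta \ell}, \qquad (3.24)$$

where the last inequality holds for $\ell \ge 20$ and large $n$ (given $\eta$, $\nu$). By the same line of arguments,
we get that its variance is bounded by a number that depends only on $\eta$, $\ell$ and $\nu$ (so in particular
independent of $n$). To make the right hand side of (3.24) bounded above by $2\ell\eta$, we may require
each of the summands in (3.24) to be bounded by $\ell\eta$. After a little simplification this amounts to

$$\nu \ge 5/C_6 \ell^2 \eta f(\ell, \eta), \quad\text{and}\quad \zeta_1(\ell\eta)^2 \ge 11\nu. \qquad (3.25)$$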
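To make the two contributions to the expected bridge weight concrete, the following sketch evaluates them numerically from the distributions described above: the mean of $\mathrm{Exp}(\nu|T|/n)$ for $W_{e'}$, and the mean of $E_1 + \cdots + E_\nu$ with rates $(|M| - i) \times 1/n \times \lfloor \ell/4 \rfloor$ as an upper bound for the mean of $W_e$. The function name and argument names are hypothetical, and the sample values in the usage note are arbitrary rather than taken from the proof.

```python
def expected_bridge_weight_bound(n, nu, t_size, m_size, ell):
    """Upper bound on E[W_e' + W_e] for one bridge.

    n: number of vertices (edge weights have mean n)
    nu: number of predecessor edges per iteration
    t_size: current size of T; m_size: current size of M
    ell: length of the good paths (open end has floor(ell/4) vertices)
    """
    q = ell // 4
    # W_e' ~ Exp(nu * |T| / n), so its mean is n / (nu * |T|).
    first = n / (nu * t_size)
    # Max predecessor weight ~ E_1 + ... + E_nu, E_{i+1} with rate (|M| - i) * q / n.
    second = sum(n / ((m_size - i) * q) for i in range(nu))
    return first + second
```

For instance, `expected_bridge_weight_bound(1000, 2, 100, 500, 8)` adds $1000/200 = 5$ for the first term to $1000/1000 + 1000/998$ for the second. Increasing `t_size` or `m_size` only decreases the bound, which is the reason for the lower bounds on $|T|$ and $|M|$ in (3.22).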


So we need to pick a positive $\delta = \delta(\eta)$, positive integers $\nu = \nu(\eta)$, $\ell = \ell(\eta)$ and an absolute constant
$\eta_2' > 0$ such that (3.23) and (3.25) hold for $0 < \eta < \eta_2'$. We will deal with (3.25) first which is in
fact equivalent to

$$\zeta_1(\ell\eta)^2/11 \ge \nu \ge 5/C_6 \ell^2 \eta f(\ell, \eta). \qquad (3.26)$$

Let us try to find an integer $\ell$ satisfying $\zeta_1(\ell\eta)^2/11 \ge (10/C_6 \ell^2 \eta f(\ell, \eta)) \vee 2$ since this will ensure the
existence of a positive integer $\nu$ such that $\nu, \ell$ satisfy (3.26). Using $f(\ell, \eta) = e^{-1000\zeta_2/\sqrt{\eta}} \sqrt{\eta^3/\ell}$,
we get that this amounts to

p ^ Cj.e2000C2/7^7 ^ 

— rp r] ’ 
for some positive, absolute constants Cj and C^. Hence there exists an absolute constant rj 2 > 0


such that the integers I = and v = [Ci(£r?)^/llJ satisfy (3.26) whenever 0 < rj < 


Now we need to find $\delta$ that would satisfy (3.23) which can be rewritten as,


n'" 
U2 • 


$$\delta \le (C_6 f(\ell, \eta)\,\ell/2) \wedge (\zeta_1 \eta \ell/2\nu).$$

Again substituting f{£,r]) = ^ we can simplify ( |3.27[ ) to 

5 < A {Ciri£/ 2 u). 

Since v = , (3.28) would be satisfied if 


(3.27) 


(3.28) 


5<(ae-'“°^^/v/^g;;)A(ll/2£r?). 


2 ^ 5/2 , 


The last display together with our particular choice of £ i.e. imply that there exists a 


positive, absolute constant Vj'^ < V 12 such that <5 = e '^oooC 2 /v^ satisfies (3.27) for 0 < r/ < r]'^. Thus 


our choice of the triplet {6,i',£) satisfies (3.23) and (3.25) for 0 < r/ < 7^2 and consequently the 


event Bn,n,u,e,5 occurs with high probability for this choice. 

As to the constraint on £, it is also clear that there exists a positive, absolute constant 773 < 772 
such that £ = is larger than Cilv all 0 < 77 < 772. Finally it is left to ensure 

whether $\gamma$ has the length required by the lemma. Since our particular choice of the triplet $(\delta, \nu, \ell)$
satisfies (3.21) for large $n$ (given $\eta$), we have that $L_{\mathrm{BRIDGE}}(\nu, \ell, \delta) \ge \lfloor \delta n/\ell \rfloor \times \ell/2$. It then follows
that there exists a positive, absolute constant 772 < 772 such that Abridge( z^j £, 5 ) > fQj- 

these particular choices of £ and 6 whenever 0 < 77 < 772 and n is large (given 77) . This completes 
the proof of the lemma. 

□ 


Combining Lemmas 3.10 and 3.11 completes the proof of the lower bound in Theorem 1.1.